Abstract

It is well known that the coefficients in Faà di Bruno's chain rule for n-th derivatives
of functions of one variable can be expressed via counting of partitions. It turns out that
this has a natural form as a formula for the vector case. Viewed as a purely algebraic
fact, it is briefly "explained" in the first part of this note why a proof for this formula
leads to partitions. In the rest of the note a proof for this formula is presented "from
first principles" for the case of n-th Fréchet derivatives of mappings between Banach
spaces. Again the proof "explains" why the formula has its form involving partitions.



Introduction 

If $I$ is a finite set, $|I|$ will denote its cardinality. $\mathbb{N}$ will denote the set of non-negative integers.
$\mathcal{P}(I)$ is the power set of $I$.

Faà di Bruno's formula (see [J] for an excellent survey and bibliography) gives a somewhat
complicated expression for $(g \circ f)^{(n)}(x)$, $f$ and $g$ functions of one variable, in terms of derivatives
up to order $n$ of $f$ at $x$ and of $g$ at $f(x)$. The coefficients can be expressed using enumeration
of partitions of a set of $n$ elements (see below).

It might seem that this settles the problem for mappings between multi-dimensional spaces
$X$: the existence of the n-th derivative for the composite mapping follows from 1-st derivative
theorems, the n-th derivatives in the vector case being treated as iterated 1-st derivatives, and
the formula for it is obtained from Faà di Bruno's formula for one variable by composing with
linear mappings $X \to \mathbb{R}$ or $\mathbb{R} \to X$. Moreover, since it is clear, by iterating the 1-st derivative
chain rule, that $(g \circ f)^{(n)}(x)$ is a certain polynomial in $f^{(k)}(x)$ and $g^{(k)}(f(x))$, $1 \le k \le n$, the
problem is only to find its form, a purely algebraic problem, and in this way it is treated in
the literature.

It turns out, however, that writing the formula for mappings between vector spaces gives 
it a very natural form, obscured by the particulars of the 1-dimensional case. This chain-rule 
formula for n-th derivatives of mappings between vector spaces is constructed in terms of 
partitions of a set I of cardinality n, as follows: 
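
$$\Big( (g \circ f)^{(n)}(x),\ \bigotimes_{i \in I} v_i \Big) = \sum_{\pi \in \mathcal{H}(I)} \Big( g^{(|\pi|)}(f(x)),\ \bigotimes_{S \in \pi} \big( f^{(|S|)}(x),\ \bigotimes_{i \in S} v_i \big) \Big) \tag{3.1}$$

where $f : X \to Y$, $g : Y \to Z$, $X, Y, Z$ are vector spaces, $x, v_i \in X$, $I$ is a set with $n$
elements and $\mathcal{H}(I)$ is the set of partitions of $I$, i.e. $\mathcal{H}(I)$ is the subset of $\mathcal{P}(\mathcal{P}(I))$ consisting
of all disjoint collections of subsets of $I$ whose union is $I$ and that do not contain the empty
set. Thus, for example: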

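$$\mathcal{H}(\emptyset) = \{\emptyset\}$$

$$\mathcal{H}(\{1\}) = \{\{\{1\}\}\}$$

$$\mathcal{H}(\{1,2\}) = \{\ \{\{1\},\{2\}\},\ \{\{1,2\}\}\ \}$$

$$\mathcal{H}(\{1,2,3\}) = \{\ \{\{1\},\{2\},\{3\}\},\ \{\{1\},\{2,3\}\},\ \{\{2\},\{1,3\}\},\ \{\{3\},\{1,2\}\},\ \{\{1,2,3\}\}\ \}$$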
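To make the role of these partitions concrete, the chain-rule formula (3.1) of Theorem 1 can be written out for small $n$. The following is only a worked expansion under the conventions of this note (pairing $(T, x_1 \otimes \dots \otimes x_k)$ for a $k$-linear symmetric $T$), with one term per partition listed above:

```latex
% n = 2: the two partitions {{1},{2}} and {{1,2}}
\big( (g \circ f)''(x),\ v_1 \otimes v_2 \big)
  = \big( g''(f(x)),\ (f'(x), v_1) \otimes (f'(x), v_2) \big)
  + \big( g'(f(x)),\ (f''(x), v_1 \otimes v_2) \big)

% n = 3: the five partitions of {1,2,3}
\big( (g \circ f)'''(x),\ v_1 \otimes v_2 \otimes v_3 \big)
  = \big( g'''(f(x)),\ (f'(x), v_1) \otimes (f'(x), v_2) \otimes (f'(x), v_3) \big)
  + \big( g''(f(x)),\ (f'(x), v_1) \otimes (f''(x), v_2 \otimes v_3) \big)
  + \big( g''(f(x)),\ (f'(x), v_2) \otimes (f''(x), v_1 \otimes v_3) \big)
  + \big( g''(f(x)),\ (f'(x), v_3) \otimes (f''(x), v_1 \otimes v_2) \big)
  + \big( g'(f(x)),\ (f'''(x), v_1 \otimes v_2 \otimes v_3) \big)
```

In the scalar case with all $v_i = 1$ the three middle terms for $n = 3$ coincide, which is where the familiar coefficient 3 in $3\, g''(f)\, f' f''$ comes from.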

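For the one-dimensional case $X = Y = Z =$ the scalars it suffices to let $v_i = 1$ and (3.1)
reads:

$$\big( (g \circ f)^{(n)}(x),\ 1 \big) = \sum_{\pi \in \mathcal{H}(I)} \Big( g^{(|\pi|)}(f(x)),\ \bigotimes_{S \in \pi} \big( f^{(|S|)}(x),\ 1 \big) \Big),$$

that is:

$$(g \circ f)^{(n)}(x) = \sum_{\pi \in \mathcal{H}(I)} g^{(|\pi|)}(f(x)) \prod_{S \in \pi} f^{(|S|)}(x), \tag{3.1'}$$

a form of Faà di Bruno's formula.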
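The scalar form of the formula, a sum over all partitions $\pi$ of an $n$-element set of $g^{(|\pi|)}(f(x)) \prod_{S \in \pi} f^{(|S|)}(x)$, can be checked numerically. The sketch below is an illustration only, not part of the original note: it assumes integer polynomials stored as ascending coefficient lists, and all function names (`set_partitions`, `chain_rule_via_partitions`, and so on) are hypothetical helpers introduced for this example.

```python
from math import prod

def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []  # the empty set has exactly one partition, the empty one
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition                 # `first` in a block of its own
        for k, block in enumerate(partition):       # or joined to an existing block
            yield partition[:k] + [[first] + block] + partition[k + 1:]

def poly_eval(coeffs, x):
    """Evaluate a polynomial with ascending coefficients by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def poly_deriv(coeffs, order=1):
    """Coefficients of the derivative of the given order."""
    for _ in range(order):
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs

def poly_add(a, b):
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return [s + t for s, t in zip(a, b)]

def poly_mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            out[i + j] += s * t
    return out

def poly_compose(g, f):
    """Coefficients of g(f(x)), again by Horner's rule."""
    result = []
    for c in reversed(g):
        result = poly_add(poly_mul(result, f), [c])
    return result

def chain_rule_via_partitions(g, f, n, x):
    """Right-hand side of the scalar formula: one term per partition."""
    y = poly_eval(f, x)
    total = 0
    for pi in set_partitions(list(range(n))):
        outer = poly_eval(poly_deriv(g, len(pi)), y)
        inner = prod(poly_eval(poly_deriv(f, len(block)), x) for block in pi)
        total += outer * inner
    return total

def direct_derivative(g, f, n, x):
    """Left-hand side: differentiate the composite polynomial n times."""
    return poly_eval(poly_deriv(poly_compose(g, f), n), x)

# Example polynomials (arbitrary choices): f = 1 + 2x + x^3, g = y + 3y^2 + 2y^4
F = [1, 2, 0, 1]
G = [0, 1, 3, 0, 2]
```

Because everything is integer arithmetic, the two sides can be compared exactly, e.g. `chain_rule_via_partitions(G, F, 3, 1) == direct_derivative(G, F, 3, 1)`. The case `n = 0` also works, reflecting that the empty set has exactly one partition.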

If (3.1) is viewed as a purely algebraic statement, one can deduce it in a straightforward,
though abstract, manner (see §1), where the role of partitions is "explained".

In Theorem 1 in §3, though, (3.1) is proved "from basic principles" for Fréchet derivatives
between Banach spaces. Once guessed, (3.1) can be proved in a straightforward manner using
induction. We have preferred to present a different proof which explains directly the form of
the formula. To this end the derivatives are treated via n-th iterated differences, i.e. alternating-sign
sums over vertices of n-dimensional parallelepipeds, rather than iteratively. The formula
is a consequence of an identity (Lemma 2) involving such sums. This identity and the proof
are best expressed in the language of free linear spaces and free commutative algebras, which
is therefore introduced. Also, this approach to the n-th derivative allows one to require just
the existence of the derivatives in the strict sense (a definition is given in §2) at the particular relevant points.
A slight complication arises from the fact that the identity of Lemma 2 involves sums over
parallelepipeds of dimensions higher than n, the order of the highest participating derivative.
This is dealt with using Lemma 1.

In Bourbaki, Variétés ([B]) the linear space of point distributions of order $\le n$ at a point
in a $C^{n}$-manifold is defined and its properties stated. The fact that this is properly defined
and is a functor may be proved using formula (3.1). Thus one may say that in spirit, (3.1)
is present in [B].
1 Where Do Partitions Come From? A Purely Algebraic Approach

In this section the scalars K may be any field. We insist on avoiding division by integers, thus 
the scalar field may have any characteristic. 

Let us be abstract. The mappings will be between "germs of n-manifolds at a point", in
short n-germs, defined by their function algebra, namely $A = A_n = K[[T_1, \dots, T_n]]$ (formal
power series), where the $T_j$ are indeterminates. $A$ has a linear topology, taking as basic
neighborhoods of 0 the positive integer powers of the ideal consisting of the formal power
series without constant term. Smooth mappings $f$ from the n-germ to the m-germ are defined
by continuous homomorphisms $\phi_f : A_m \to A_n$. (The "actual" $m$ components of $f$ as a
mapping are the "functions" $\phi_f(T_i)$, $i = 1, \dots, m$.) The continuity of $\phi_f$ means just that any
coefficient of $\phi_f(F)$ is only a function of a finite number of coefficients of $F$. Note that, by
the continuity, since in the linear topology of $A$ we have $T_i^k \to 0$ as $k \to \infty$, also $(\phi_f(T_i))^k \to 0$ as $k \to \infty$, hence
the $\phi_f(T_i)$ have no constant term; indeed $f(0) = 0$.

Let $A'$ be the "dual", i.e. the vector space of continuous functionals $\xi : A \to K$, $K$
given the discrete topology. Continuity again means that $\xi(F)$ depends only on a finite set of
coefficients of $F$.

The multiplication on $A$ induces a comultiplication $\Delta : A' \to A' \otimes A'$ by

$$(\Delta(\xi))(F \otimes G) := \xi(FG), \qquad F, G \in A. \tag{1.1}$$

This comultiplication, with the counit $\xi \mapsto \xi(1)$, turns $A'$ into a (coassociative and cocommutative)
coalgebra over $K$. $A'$ contains the Dirac delta $\delta$ at 0, defined by $\delta(F) :=$ the constant
term in $F$, which satisfies $\Delta(\delta) = \delta \otimes \delta$. We also have the primitive elements in $A'$, i.e. those
satisfying $\Delta(\xi) = \xi \otimes \delta + \delta \otimes \xi$, which means by (1.1) that the functional $\xi$ is a "tangent vector"
at 0. Indeed, they constitute a vector space over $K$ isomorphic to $K^n$ by the isomorphism
$\xi \mapsto (\xi(T_1), \dots, \xi(T_n))$. Denote this "tangent space" of the n-germ by $V = V_n$. We identify a
"vector" in $V$ with the corresponding "vector" in $K^n$.

But there is more structure: one may define the convolution of two elements $\xi, \eta$ of $A'$
(giving an element of $A'$), and the convolution of an element $\xi$ of $A'$ and an $F \in A$ (giving an
element of $A$) by:

$$(\xi * \eta)(F) := \xi\big(T \mapsto \eta(S \mapsto F(S+T))\big), \qquad \xi * F := \big(T \mapsto \xi(S \mapsto F(S+T))\big), \qquad F \in A. \tag{1.2}$$

The first convolution makes $A'$ also into an (associative and commutative) algebra, with unit
$\delta$, which acts linearly on $A$ by the second convolution. In particular, as one easily sees, the
primitive elements $v \in V$ act on $A$ as the directional partial derivatives with direction $v$.
Indeed, one may write:

$$(F', v) = v * F, \qquad v(F) = (F'(0), v), \qquad v \in V,\ F \in A.$$

Convolutions of elements $v$ in $V$ will act, by convolution with members of $A$, as composition
of the actions of the $v$, i.e. by higher-order differential operators. One finds that one may also
write:

$$(v_1 * \dots * v_k)(F) = \big(F^{(k)}(0),\ v_1 \otimes \dots \otimes v_k\big), \qquad v_i \in V,\ F \in A. \tag{1.3}$$






Indeed:

$$\big(F^{(k)}(0),\ v_1 \otimes \dots \otimes v_k\big) = \big(F^{(k)},\ v_1 \otimes \dots \otimes v_k\big)\big|_0 = (v_1 * \dots * v_k * F)\big|_0 = (v_1 * \dots * v_k * \delta)(F) = (v_1 * \dots * v_k)(F).$$



Yet there is a difference between the comultiplication and the convolution multiplication:
since any smooth mapping $f$ defines a homomorphism $\phi_f$ of the algebra $A$, it will induce a
dual map on $A'$ (which we again denote by $\phi_f$) which will preserve the comultiplication, i.e.
$\phi_f(\Delta(\xi)) = \Delta(\phi_f(\xi))$. (Applying a mapping such as $\phi_f$ to the tensor product is understood
factorwise.) Thus, the coalgebra structure is an invariant of the manifold structure, so to
speak. On the other hand the convolution was defined in (1.2) using the additive structure
of $K^n$, so to speak, and a smooth mapping need preserve it only if it is linear. Nevertheless,
using the fact that the convolution in $A'$ is, in a sense, the map $A' \otimes A' \to A'$ induced by the
smooth mapping $K^n \times K^n \to K^n$ given by addition, one shows that convolution preserves the
comultiplication, hence $A'$ is a bialgebra (it turns out to be a Hopf algebra).

This can be used to compute $\Delta$ on $*_{i \in I} v_i$, $v_i \in V$, $I$ a finite set. (Recall that $\delta$ is the unit
of convolution):

$$\Delta\big(*_{i \in I} v_i\big) = *_{i \in I} \Delta(v_i) = *_{i \in I} (v_i \otimes \delta + \delta \otimes v_i) = \sum_{S \subset I} \big(*_{i \in S} v_i\big) \otimes \big(*_{i \in I \setminus S} v_i\big). \tag{1.4}$$



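One defines $\Delta^{(3)} : A' \to A' \otimes A' \otimes A'$ by $\Delta^{(3)} := (\Delta \otimes \mathrm{id}) \circ \Delta = (\mathrm{id} \otimes \Delta) \circ \Delta$ (the latter
being equal by coassociativity), and one has $\Delta^{(3)}(\xi)(F \otimes G \otimes H) = \xi(FGH)$. Similarly for
$\Delta^{(n)}$, $n \in \mathbb{N}$. For $v \in V$ one has $\Delta^{(3)} v = v \otimes \delta \otimes \delta + \delta \otimes v \otimes \delta + \delta \otimes \delta \otimes v$ etc. In analogy to (1.4),
one has, for $v_i \in V$, $I$ a finite set:

$$\Delta^{(k)}\big(*_{i \in I} v_i\big) = \sum_{I = \text{disj.} \bigcup_{j=1}^{k} S_j}\ \bigotimes_{j=1}^{k} \big(*_{i \in S_j} v_i\big), \tag{1.5}$$

the sum being over all ordered $k$-tuples $(S_1, \dots, S_k)$ of pairwise disjoint (possibly empty) subsets of $I$ whose union is $I$.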



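Let $f$ be a smooth mapping from the n-germ to the m-germ. Applying (1.3) for $F = f_j =
\phi_f(T_j)$ one obtains (recall that $f^{(k)}(0)$ is a linear map $S^k(V_n) \to V_m$; for the notation $S^k(V)$
see §2.1):

$$\big[\phi_f(v_1 * \dots * v_k)\big](T_j) = \big[\big(f^{(k)}(0),\ v_1 \otimes \dots \otimes v_k\big)\big](T_j) = \big[\big(f^{(k)}(0),\ v_1 \otimes \dots \otimes v_k\big)\big]_j, \qquad v_i \in V,\ j = 1, \dots, m. \tag{1.6}$$

We claim that, for $v_i \in V$, $I$ a finite set (for the notation $\mathcal{H}(I)$ see the Introduction):

$$\phi_f\big(*_{i \in I} v_i\big) = \sum_{\pi \in \mathcal{H}(I)} *_{S \in \pi} \big(f^{(|S|)}(0),\ \bigotimes_{i \in S} v_i\big). \tag{1.7}$$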



To get (1.7), it suffices that both sides give the same value when applied to a monomial in
the $T_j$'s, that is, that applying $\Delta^{(k)}$ to both sides one obtains elements of the k-th tensor
power of $A'$ that give the same value when applied to tensors of $T_j$'s. But that follows in a
straightforward (though somewhat clumsy) manner from (1.5) and (1.6) and the fact that $\phi_f$
preserves the comultiplication.

Now (3.1) follows from (1.7) and (1.6): indeed, if $f$ and $g$ are smooth mappings, $v_i \in V$,
$I$ a finite set and $j = 1, \dots, n$ where $n$ is the dimension of the image germ of $g$:

$$\Big[\big((g \circ f)^{(|I|)}(0),\ \bigotimes_{i \in I} v_i\big)\Big]_j = \big[\phi_g\big(\phi_f(*_{i \in I} v_i)\big)\big](T_j) = \Big[\phi_g\Big(\sum_{\pi \in \mathcal{H}(I)} *_{S \in \pi} \big(f^{(|S|)}(0),\ \bigotimes_{i \in S} v_i\big)\Big)\Big](T_j) = \sum_{\pi \in \mathcal{H}(I)} \Big[\Big(g^{(|\pi|)}(0),\ \bigotimes_{S \in \pi} \big(f^{(|S|)}(0),\ \bigotimes_{i \in S} v_i\big)\Big)\Big]_j.$$

2 Preliminaries to the Banach Space Approach

Let $X$, $Y$, $Z$ denote linear spaces over a field (the scalar field); from §3 on they are Banach spaces
unless otherwise stated. The choice of $\mathbb{R}$ or $\mathbb{C}$ as scalar field is insignificant and will not be
mentioned.

2.1 Symmetric Tensors

We consider the symmetric power $S^n(X)$ of $X$, defined as the quotient space of the n-fold
(algebraic) tensor product with respect to its subspace spanned by the elements

$$x_1 \otimes x_2 \otimes \dots \otimes x_n - x_{\sigma(1)} \otimes x_{\sigma(2)} \otimes \dots \otimes x_{\sigma(n)}$$

where $x_i \in X$ and $\sigma$ is a permutation of $\{1, 2, \dots, n\}$. The image of a tensor $x_1 \otimes \dots \otimes x_n$
in $S^n(X)$ by the quotient map will still be denoted by $x_1 \otimes \dots \otimes x_n$. Thus, $(x_1, \dots, x_n) \mapsto
x_1 \otimes \dots \otimes x_n \in S^n(X)$ is multilinear and symmetric and one may freely employ the notation
$\bigotimes_{i \in I} x_i$ for a finite family $(x_i)_{i \in I}$.

In particular, $S^0(X)$ may be identified with the scalar field.

In the direct sum $S(X) := \bigoplus_{n} S^n(X)$, $\otimes$ is an associative and commutative multiplication.
$S(X)$ is called the symmetric algebra of $X$.

When $X$ and $Y$ are Banach spaces, the bounded linear operators $T : S^n(X) \to Y$, i.e.
those satisfying

$$\|T\| := \sup_{\|x_i\| \le 1} \big\|(T,\ x_1 \otimes \dots \otimes x_n)\big\| < \infty$$

form, with the above norm, a Banach space, to be denoted by $\mathcal{L}^{(n)}(X, Y)$. Of course, this
space may be identified with the Banach space of bounded symmetric n-multilinear operators
from $X^n$ to $Y$.



2.2 The n-th strict (Fréchet) derivative

It may be defined, instead of iteratively, via n-th order differences (or, in other words, alternating-sign
sums over vertices of parallelepipeds), as follows:

Definition 1 Let $X$, $Y$ be Banach spaces, $U \subset X$, $f : U \to Y$, $x \in U^\circ$ (the interior of $U$).
Let $n \in \mathbb{N}$. We say that $f$ has a (strict) n-th (Fréchet) derivative at $x$, the derivative being
the element $f^{(n)}(x)$ of $\mathcal{L}^{(n)}(X, Y)$, if $f^{(n)}(x)$ satisfies, for a set $I$ with $|I| = n$:

$$\sum_{S \subset I} (-1)^{|I| - |S|} f\Big(v + \sum_{i \in S} v_i\Big) = \Big(f^{(n)}(x),\ \bigotimes_{i \in I} v_i\Big) + o\Big(\prod_{i \in I} \|v_i\|\Big) \quad \text{as } v \to x,\ v_i \to 0. \tag{2.1}$$

Such an $f^{(n)}(x)$ is necessarily unique, if it exists.

Defined this way, $f^{(n)}(x)$ may exist even if previous derivatives fail to exist in a neighborhood
of $x$ or even at $x$ itself (for example any non-continuous linear operator $f$ has $f^{(n)} = 0$,
$n \ge 2$).

Note that $f^{(0)}(x) = y$ means that $f$ is continuous at $x$ and $f(x) = y$.
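The alternating-sign sum over the vertices of a parallelepiped that appears in Definition 1 is easy to experiment with in the scalar case. The sketch below is only an illustration: it takes $X = Y =$ the integers, and the name `alternating_sum` is a hypothetical helper, not notation from this note.

```python
from itertools import combinations

def alternating_sum(f, v, increments):
    """Sum over S of (-1)^(|I|-|S|) * f(v + sum of increments indexed by S).

    I is the index set of `increments`; S runs over all subsets of I.
    """
    n = len(increments)
    total = 0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            point = v + sum(increments[i] for i in S)
            total += (-1) ** (n - r) * f(point)
    return total

def cube(x):
    return x ** 3

def linear(x):
    return 4 * x

# Third-order difference of x^3 equals f'''(x) * v1 * v2 * v3 = 6 * v1 * v2 * v3 exactly.
third = alternating_sum(cube, 5, [2, 3, 7])

# Second-order difference of x^3 at v = 1: equals f''(v) * h * k plus a remainder
# 3*h*k*(h + k), which is o(|h| |k|) as h, k -> 0.
second = alternating_sum(cube, 1, [2, 3])

# A linear map has vanishing differences of order >= 2.
flat = alternating_sum(linear, 5, [2, 3])
```

For a cubic the third-order sum is independent of the base point $v$, while the second-order sum reproduces $f''(v)\,hk$ only up to the remainder, which is the role of the $o(\cdot)$ term in the definition.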
The proof of the following fact is standard: 

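Fact: Suppose $f^{(n)}(x)$ exists for every $x$ in an open $U \subset X$. Then:

(i) $f^{(n)} : U \to \mathcal{L}^{(n)}(X, Y)$ is continuous.

(ii) If $v \in U$, $v_i \in X$ are such that the "parallelepiped"

$$\Big\{ v + \sum_{i \in I} t_i v_i : t = (t_i)_{i \in I} \in [0,1]^I \Big\} \subset U$$

then ($|I| = n$):

$$\sum_{S \subset I} (-1)^{|I| - |S|} f\Big(v + \sum_{i \in S} v_i\Big) = \int_{[0,1]^I} \Big(f^{(n)}\Big(v + \sum_{i \in I} t_i v_i\Big),\ \bigotimes_{i \in I} v_i\Big)\, dt,$$

$dt$ being the n-dimensional Lebesgue measure.

(iii) (consequence of (ii)): If $x \in U$, $m \in \mathbb{N}$ then $f^{(n)}$ has a (strict) m-th derivative at $x$ iff
$f$ has a (strict) (m+n)-th derivative at $x$ and then, for $|J| = m$, $|I| = n$:

$$\Big(\Big(\big(f^{(n)}\big)^{(m)}(x),\ \bigotimes_{j \in J} u_j\Big),\ \bigotimes_{i \in I} v_i\Big) = \Big(f^{(m+n)}(x),\ \bigotimes_{j \in J} u_j \otimes \bigotimes_{i \in I} v_i\Big).$$

(iii) implies that the above definition of $f^{(n)}(x)$ coincides with the iterative definition
whenever $f^{(m)}$, $m \le n$, exist in a neighborhood of $x$.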
3 The Chain Rule for the n-th Fréchet Derivative

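Theorem 1. Let $n \in \mathbb{N}$, $f : U \subset X \to Y$, $g : V \subset Y \to Z$ ($X$, $Y$, $Z$ Banach spaces)
be such that $f(U) \subset V$. Let $x \in U^\circ$ (interior of $U$) be such that $f^{(k)}$, $0 \le k \le n$, exist at
$x$, $f(x) \in V^\circ$ and $g^{(k)}$, $0 \le k \le n$, exist at $f(x)$. Then $(g \circ f)^{(n)}(x)$ exists and we have the
formula ($|I| = n$):

$$\Big( (g \circ f)^{(n)}(x),\ \bigotimes_{i \in I} v_i \Big) = \sum_{\pi \in \mathcal{H}(I)} \Big( g^{(|\pi|)}(f(x)),\ \bigotimes_{S \in \pi} \big( f^{(|S|)}(x),\ \bigotimes_{i \in S} v_i \big) \Big) \tag{3.1}$$

All derivatives are in the sense of Definition 1.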

Proof. Let us remark first that if it is assumed that the derivatives of order $\le n$ exist
in neighborhoods of $x$ and $f(x)$ and one employs the iterative definition, then (3.1) may
be proved by induction in a straightforward manner (assuming one guesses the formula in
advance). We shall rather give a proof that assumes existence of the derivatives only at $x$ and
$f(x)$ using Definition 1, and moreover "explains" why (3.1) has its form.

Before we proceed with the proof, let us introduce the following auxiliary concepts and 
notations: 

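Assume, for the moment, that $X$ is just a set. Denote by $\mathcal{E}X$ the free linear space with basis
denoted by $(\mathcal{E}x)_{x \in X}$. In this way, $\mathcal{E}$ is a functor: for any sets $X$, $Y$ and function $f : X \to Y$
we have a linear function $f^{\mathcal{E}} : \mathcal{E}X \to \mathcal{E}Y$ defined by $f^{\mathcal{E}}(\mathcal{E}x) = \mathcal{E}(f(x))$. In case $Y$ is a linear
space, one has also a linear $f^{\sim} : \mathcal{E}X \to Y$ defined by $f^{\sim}(\mathcal{E}x) = f(x)$.

For a linear space $X$, $\mathcal{E}X$ is a commutative algebra where $\mathcal{E}x \cdot \mathcal{E}y = \mathcal{E}(x + y)$, i.e. $\mathcal{E}X$ is the
group-algebra of the additive group $X$. It has the unit element $1 = \mathcal{E}0$. For linear $f : X \to Y$,
$f^{\mathcal{E}}$ is an algebra homomorphism preserving the unit element.

$\mathcal{E}$ may be iterated: for any set $X$ we have the commutative algebra $\mathcal{E}\mathcal{E}X$.

For the case that $X$, $Y$, $Z$ are linear spaces and $f : X \to Y$, $g : Y \to Z$ any functions, we
have the following formulas, where $f^{\sim\mathcal{E}} := (f^{\sim})^{\mathcal{E}} : \mathcal{E}\mathcal{E}X \to \mathcal{E}Y$:

$$f(x) = f^{\sim}(\mathcal{E}x) \tag{3.2}$$

$$\mathcal{E}f(x) = f^{\mathcal{E}}(\mathcal{E}x) = f^{\sim\mathcal{E}}(\mathcal{E}\mathcal{E}x) \tag{3.2'}$$

$$g(f(x)) = g^{\sim}(\mathcal{E}f(x)) = g^{\sim} f^{\sim\mathcal{E}}(\mathcal{E}\mathcal{E}x) \tag{3.2''}$$

Note that $f^{\sim\mathcal{E}}$ is always an algebra homomorphism, since $f^{\sim}$ is linear.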

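Let us rewrite the left-hand side expression of (2.1) (Definition 1) using these notations:

$$\sum_{S \subset I} (-1)^{|I| - |S|} f\Big(v + \sum_{i \in S} v_i\Big) = \sum_{S \subset I} (-1)^{|I| - |S|} f^{\sim}\Big(\mathcal{E}\Big(v + \sum_{i \in S} v_i\Big)\Big)$$

$$= f^{\sim}\Big(\sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}v \prod_{i \in S} \mathcal{E}v_i\Big) = f^{\sim}\Big(\mathcal{E}v \prod_{i \in I} (\mathcal{E}v_i - 1)\Big).$$

Thus (2.1) takes the form:

$$f^{\sim}\Big(\mathcal{E}v \prod_{i \in I} (\mathcal{E}v_i - 1)\Big) = \Big(f^{(n)}(x),\ \bigotimes_{i \in I} v_i\Big) + o_{v \to x,\ v_i \to 0}\Big(\prod_{i \in I} \|v_i\|\Big) \tag{3.3}$$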
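The rewriting of the alternating sum as a product in the group algebra of the additive group $X$ can be tried out concretely. The following is a minimal sketch, assuming $X$ is the integers and representing an element of the free linear space as a Python dict from basis points to integer coefficients; the names `e`, `add`, `mul`, `extend`, `alternating_sum` and `corner_product` are hypothetical helpers for this illustration.

```python
from itertools import combinations

def e(x):
    """Basis element of the free linear space attached to the point x."""
    return {x: 1}

def add(a, b, sign=1):
    """a + sign * b, dropping zero coefficients."""
    out = dict(a)
    for x, c in b.items():
        out[x] = out.get(x, 0) + sign * c
    return {x: c for x, c in out.items() if c != 0}

def mul(a, b):
    """Group-algebra product: e(x) * e(y) = e(x + y), extended bilinearly."""
    out = {}
    for x, c in a.items():
        for y, d in b.items():
            out[x + y] = out.get(x + y, 0) + c * d
    return {x: c for x, c in out.items() if c != 0}

def extend(f, a):
    """Linear extension of f to formal combinations: sum of c_x * f(x)."""
    return sum(c * f(x) for x, c in a.items())

def alternating_sum(v, incs):
    """Sum over S of (-1)^(|I|-|S|) * e(v + sum of incs indexed by S)."""
    n = len(incs)
    total = {}
    for r in range(n + 1):
        for S in combinations(range(n), r):
            point = v + sum(incs[i] for i in S)
            total = add(total, e(point), sign=(-1) ** (n - r))
    return total

def corner_product(v, incs):
    """e(v) times the product of (e(v_i) - 1), where 1 = e(0) is the unit."""
    out = e(v)
    for h in incs:
        out = mul(out, add(e(h), e(0), sign=-1))
    return out
```

Both constructions give the same formal combination of points, so applying the linear extension of any function $f$ to either one yields the n-th order difference; for example `extend(lambda x: x ** 3, corner_product(5, [2, 3, 7]))` returns the third-order difference of the cube function.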



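Lemma 1 If $f^{(n)}(x)$ exists but one takes $|I| > n$, then for any $J \subset I$, $|J| = n$, the left-hand side
of (3.3) is $o_{v \to x,\ v_i \to 0}\big(\prod_{i \in J} \|v_i\|\big)$.

Proof of Lemma 1.

$$f^{\sim}\Big(\mathcal{E}v \prod_{i \in I} (\mathcal{E}v_i - 1)\Big) = f^{\sim}\Big(\mathcal{E}v \prod_{i \in I \setminus J} (\mathcal{E}v_i - 1) \prod_{i \in J} (\mathcal{E}v_i - 1)\Big)$$

$$= f^{\sim}\Big(\mathcal{E}v \sum_{S \subset I \setminus J} (-1)^{|I \setminus J| - |S|} \prod_{i \in S} \mathcal{E}v_i \prod_{i \in J} (\mathcal{E}v_i - 1)\Big)$$

$$= \sum_{S \subset I \setminus J} (-1)^{|I \setminus J| - |S|} f^{\sim}\Big(\mathcal{E}\Big(v + \sum_{i \in S} v_i\Big) \prod_{i \in J} (\mathcal{E}v_i - 1)\Big)$$

By (3.3) (note that $|J| = n$ and that $v + \sum_{i \in S} v_i \to x$), the last expression is

$$\sum_{S \subset I \setminus J} (-1)^{|I \setminus J| - |S|} \Big(f^{(n)}(x),\ \bigotimes_{i \in J} v_i\Big) + o\Big(\prod_{i \in J} \|v_i\|\Big).$$

And the lemma follows from the fact that the first term vanishes, since

$$\sum_{S \subset I \setminus J} (-1)^{|I \setminus J| - |S|} = (1 - 1)^{|I \setminus J|} = 0$$

because $|I \setminus J| > 0$.
QED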



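Now (3.1) will follow, using (3.3), from the following identity which will be proved later:

Lemma 2 For any finite set $I$ and vectors $v, v_i$ in a linear space $X$:

$$\sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}\Big(v + \sum_{i \in S} v_i\Big) = \sum_{\alpha \subset \mathcal{P}(I),\ \bigcup \alpha = I,\ \emptyset \notin \alpha} \mathcal{E}\mathcal{E}v \prod_{A \in \alpha} \Big(\mathcal{E}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big) \tag{$*$}$$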
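The identity of Lemma 2 lives in the twice-iterated free algebra, where a basis element is labelled by a whole formal combination of points of $X$ and the product of two basis elements is the basis element labelled by the sum of their labels. It can be checked by brute force for small index sets. The sketch below is an illustration under the assumption that $X$ is the integers; level-1 elements are dicts from points to coefficients, level-2 elements are dicts keyed by frozen level-1 elements, and all names are hypothetical.

```python
from itertools import combinations

def subsets(items):
    items = list(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)

def clean(d):
    return {k: c for k, c in d.items() if c != 0}

def lin_add(a, b, sign=1):
    """a + sign * b for formal combinations at either level."""
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + sign * c
    return clean(out)

def e1(x):
    """Level 1: basis element attached to a point x of X."""
    return {x: 1}

def mul1(a, b):
    """Level-1 product: e1(x) * e1(y) = e1(x + y)."""
    out = {}
    for x, c in a.items():
        for y, d in b.items():
            out[x + y] = out.get(x + y, 0) + c * d
    return clean(out)

def e2(a):
    """Level 2: basis element attached to a level-1 element a."""
    return {frozenset(a.items()): 1}

def mul2(p, q):
    """Level-2 product: e2(a) * e2(b) = e2(a + b)."""
    out = {}
    for ka, c in p.items():
        for kb, d in q.items():
            key = frozenset(lin_add(dict(ka), dict(kb)).items())
            out[key] = out.get(key, 0) + c * d
    return clean(out)

def corner(v, incs, A):
    """Level-1 element e1(v) * product over i in A of (e1(v_i) - 1)."""
    out = e1(v)
    for i in A:
        out = mul1(out, lin_add(e1(incs[i]), e1(0), sign=-1))
    return out

def lhs(v, incs):
    n = len(incs)
    total = {}
    for S in subsets(range(n)):
        point = v + sum(incs[i] for i in S)
        total = lin_add(total, e2(e1(point)), sign=(-1) ** (n - len(S)))
    return total

def rhs(v, incs):
    n = len(incs)
    nonempty = [A for A in subsets(range(n)) if A]
    total = {}
    for alpha in subsets(nonempty):          # collections of nonempty subsets of I
        if set().union(*alpha) != set(range(n)):
            continue                          # keep only collections covering I
        term = e2(e1(v))
        for A in alpha:
            factor = lin_add(e2(corner(v, incs, A)), e2({}), sign=-1)  # e2(w_A) - 1
            term = mul2(term, factor)
        total = lin_add(total, term)
    return total
```

The number of covering collections grows doubly exponentially in $|I|$, so this check is only practical for $|I| \le 3$ or so; it is meant as a sanity check of the statement, not as a proof.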



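Proceeding with the proof of (3.1), one has, by (3.2), (3.2') and (3.2''), using ($*$):

$$\sum_{S \subset I} (-1)^{|I| - |S|}\, g\Big(f\Big(v + \sum_{i \in S} v_i\Big)\Big) = g^{\sim} f^{\sim\mathcal{E}}\Big(\sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}\Big(v + \sum_{i \in S} v_i\Big)\Big)$$

$$= g^{\sim} f^{\sim\mathcal{E}}\Big(\sum_{\alpha \subset \mathcal{P}(I),\ \bigcup \alpha = I,\ \emptyset \notin \alpha} \mathcal{E}\mathcal{E}v \prod_{A \in \alpha} \Big(\mathcal{E}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big)\Big)$$

$$= \sum_{\alpha \subset \mathcal{P}(I),\ \bigcup \alpha = I,\ \emptyset \notin \alpha} g^{\sim}\Big(\mathcal{E}f^{\sim}(\mathcal{E}v) \prod_{A \in \alpha} \Big(\mathcal{E}f^{\sim}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big)\Big) = \sum_{\alpha \subset \mathcal{P}(I),\ \bigcup \alpha = I,\ \emptyset \notin \alpha} D_\alpha.$$

$D_\alpha$ is an expression of the form appearing in the left-hand side of (3.3), with $I$ replaced
by $\alpha$, $f$ by $g$ and the vectors $v$ and $v_i$ by $w = f(v)$ and $w_A = f^{\sim}\big(\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\big)$.

Since $f$ is continuous at $x$ (having a strict 0-derivative), $w \to f(x)$ as $v \to x$. Also, for
$\emptyset \ne A \subset I$,

$$w_A = f^{\sim}\Big(\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big) = \Big(f^{(|A|)}(x),\ \bigotimes_{i \in A} v_i\Big) + o\Big(\prod_{i \in A} \|v_i\|\Big) = O\Big(\prod_{i \in A} \|v_i\|\Big)$$

which tends to 0 as $v \to x$, $v_i \to 0$.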

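Now, since $g$ is assumed to have strict derivatives at $f(x)$ up to order $n$, (3.3) and Lemma
1 applied to $g$ give:

$$D_\alpha = g^{\sim}\Big(\mathcal{E}w \prod_{A \in \alpha} (\mathcal{E}w_A - 1)\Big) = \begin{cases} \Big(g^{(|\alpha|)}(f(x)),\ \bigotimes_{A \in \alpha} w_A\Big) + o\Big(\prod_{A \in \alpha} \|w_A\|\Big) = O\Big(\prod_{A \in \alpha} \|w_A\|\Big) & \text{if } |\alpha| \le n \\ o\Big(\prod_{A \in J} \|w_A\|\Big) \ \text{ for any } J \subset \alpha \text{ with } |J| = n & \text{if } |\alpha| > n \end{cases} \tag{3.4}$$

(All $O$'s and $o$'s are for $v \to x$, $v_i \to 0$.)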

In order to deal with the second case in (3.4), note that we have:

Lemma 3 If $\bigcup \alpha = I$, then $\exists \beta \subset \alpha$ such that $\bigcup \beta = I$ and $|\beta| \le n$.

Proof. Just pick for each $i \in I$ an $A \in \alpha$ such that $i \in A$ and let $\beta$ be the set of $A$'s picked.
QED
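The proof of Lemma 3 is a one-line selection procedure, which can be spelled out as follows. This is only a sketch; `pick_subcover` is a hypothetical name, and the cover is assumed to be given as a list of frozensets.

```python
def pick_subcover(alpha, index_set):
    """For each i in the index set pick one member of alpha containing i.

    Assumes the union of alpha is the whole index set, so a member
    containing i always exists. At most one member is picked per index,
    hence the result has at most len(index_set) members.
    """
    beta = set()
    for i in sorted(index_set):
        chosen = next(A for A in alpha if i in A)
        beta.add(chosen)
    return beta

# Example: a redundant cover of {1, 2, 3} by five subsets.
I = {1, 2, 3}
alpha = [frozenset({1}), frozenset({1, 2}), frozenset({2, 3}),
         frozenset({3}), frozenset({1, 2, 3})]
beta = pick_subcover(alpha, I)
```

The subcover need not be minimal; the lemma only needs its size to be at most $n = |I|$.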

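Thus, if $\bigcup \alpha = I$ and $|\alpha| > n$ we can always find a $J \subset \alpha$ with $|J| = n$, $\bigcup J = I$ and

$$D_\alpha = o\Big(\prod_{A \in J} \|w_A\|\Big) = o\Big(\prod_{i \in I} \|v_i\|\Big).$$

If $\bigcup \alpha = I$ and $|\alpha| \le n$ but $\alpha$ is not a partition, then

$$D_\alpha = O\Big(\prod_{A \in \alpha} \|w_A\|\Big) = o\Big(\prod_{i \in I} \|v_i\|\Big),$$

since the $A$'s cover $I$ with redundancy. So we are left with the $D_\alpha$'s for $\alpha \in \mathcal{H}(I)$, for which we have:

$$D_\alpha = \Big(g^{(|\alpha|)}(f(x)),\ \bigotimes_{A \in \alpha} \Big[\Big(f^{(|A|)}(x),\ \bigotimes_{i \in A} v_i\Big) + o\Big(\prod_{i \in A} \|v_i\|\Big)\Big]\Big) + o\Big(\prod_{i \in I} \|v_i\|\Big)$$

$$= \Big(g^{(|\alpha|)}(f(x)),\ \bigotimes_{A \in \alpha} \Big(f^{(|A|)}(x),\ \bigotimes_{i \in A} v_i\Big)\Big) + o\Big(\prod_{i \in I} \|v_i\|\Big)$$

which implies (3.1).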

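The only thing left to be done is the:

Proof of Lemma 2 (i.e. of ($*$)):

$$\sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}\Big(v + \sum_{i \in S} v_i\Big) = \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\Big[\mathcal{E}v \prod_{i \in S} \big(1 + (\mathcal{E}v_i - 1)\big)\Big]$$

$$= \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\Big[\mathcal{E}v \sum_{A \subset S} \prod_{i \in A} (\mathcal{E}v_i - 1)\Big]$$

$$= \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\Big[\mathcal{E}v + \sum_{A \subset S,\ A \ne \emptyset} \mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big]$$

$$= \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}v \prod_{A \subset S,\ A \ne \emptyset} \mathcal{E}\Big(\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big)$$

$$= \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}v \prod_{A \subset S,\ A \ne \emptyset} \Big[1 + \Big(\mathcal{E}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big)\Big]$$

$$= \sum_{S \subset I} (-1)^{|I| - |S|}\, \mathcal{E}\mathcal{E}v \sum_{\alpha \subset \mathcal{P}(S),\ \emptyset \notin \alpha}\ \prod_{A \in \alpha} \Big(\mathcal{E}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big)$$

$$= \sum_{\alpha \subset \mathcal{P}(I),\ \emptyset \notin \alpha} \Big(\sum_{S \subset I,\ \bigcup \alpha \subset S} (-1)^{|I| - |S|}\Big)\, \mathcal{E}\mathcal{E}v \prod_{A \in \alpha} \Big(\mathcal{E}\Big[\mathcal{E}v \prod_{i \in A} (\mathcal{E}v_i - 1)\Big] - 1\Big)$$

and ($*$) follows from the fact that for fixed $\alpha$

$$\sum_{S \subset I,\ \bigcup \alpha \subset S} (-1)^{|I| - |S|} = \begin{cases} 1 & \text{if } \bigcup \alpha = I \\ 0 & \text{otherwise} \end{cases}$$

QED

This completes the proof of Theorem 1.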
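The counting fact used at the end of the proof of Lemma 2, that the alternating sum over all $S$ between a fixed covered set and $I$ is 1 when the covered set is all of $I$ and 0 otherwise, is the same cancellation that makes the first term vanish in Lemma 1. A small brute-force check, with the hypothetical helper name `signed_count`:

```python
from itertools import combinations

def signed_count(index_set, covered):
    """Sum of (-1)^(|I|-|S|) over all S with covered <= S <= I.

    `covered` is assumed to be a subset of `index_set`.
    """
    free = sorted(set(index_set) - set(covered))   # elements S may or may not contain
    n = len(index_set)
    total = 0
    for r in range(len(free) + 1):
        for _extra in combinations(free, r):
            # here S = covered plus the chosen extras, so |S| = |covered| + r
            total += (-1) ** (n - len(covered) - r)
    return total
```

With `covered` empty and a nonempty index set this reproduces the vanishing sum $(1-1)^{|I \setminus J|} = 0$ from the proof of Lemma 1.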